1.
World J Urol; 42(1): 250, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38652322

ABSTRACT

PURPOSE: To compare the performance of ChatGPT-4 and ChatGPT-3.5 on the Taiwan urology board examination (TUBE), focusing on answer accuracy, explanation consistency, and uncertainty-management tactics to minimize score penalties from incorrect responses across 12 urology domains. METHODS: A total of 450 multiple-choice questions from the TUBE (2020-2022) were presented to both models. Three urologists assessed the correctness and consistency of each response. Accuracy was defined as the proportion of correct answers; consistency as the proportion of responses with logical, coherent explanations. A penalty-reduction experiment with prompt variations was also conducted. Univariate logistic regression was applied for subgroup comparisons. RESULTS: ChatGPT-4 showed strengths in urology, achieving an overall accuracy of 57.8%, with annual accuracies of 64.7% (2020), 58.0% (2021), and 50.7% (2022), significantly surpassing ChatGPT-3.5 (33.8%; OR = 2.68, 95% CI 2.05-3.52). On accuracy alone it would have passed the TUBE written exams, but penalties for incorrect answers brought its final score below the passing threshold. ChatGPT-4 displayed a declining accuracy trend over time. Accuracy varied across the 12 urological domains, with more frequently updated knowledge domains showing lower accuracy (53.2% vs. 62.2%, OR = 0.69, p = 0.05). A high consistency rate of 91.6% in explanations across all domains indicates reliable delivery of coherent and logical information. A simple prompt outperformed strategy-based prompts in accuracy (60% vs. 40%, p = 0.016), highlighting ChatGPT's inability to accurately self-assess uncertainty and its tendency towards overconfidence, which may hinder medical decision-making. CONCLUSIONS: ChatGPT-4's high accuracy and consistent explanations on the urology board examination demonstrate its potential for medical information processing. However, its limitations in self-assessment and its overconfidence call for caution in its application, especially by inexperienced users. These insights underscore the need for ongoing development of urology-specific AI tools.
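The accuracy and odds-ratio comparison described in the methods can be reproduced with a few lines of analysis code. Below is a minimal sketch assuming a hypothetical dataset with one row per question per model and illustrative column names ("model", "correct"); it is not the study's actual pipeline.

```python
# Minimal sketch: per-model accuracy and a univariate logistic regression
# comparing correctness between models. File and column names are
# hypothetical assumptions, not the study's actual data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("tube_responses.csv")  # hypothetical: one row per question per model

# Accuracy = correct answers / total responses, per model
print(df.groupby("model")["correct"].mean())

# Univariate logistic regression with GPT-3.5 as the reference level
fit = smf.logit("correct ~ C(model, Treatment(reference='gpt-3.5'))", data=df).fit()
print(np.exp(fit.params))      # odds ratios
print(np.exp(fit.conf_int()))  # 95% confidence intervals on the OR scale
```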


Subjects
Educational Measurement, Urology, Taiwan, Educational Measurement/methods, Clinical Competence, Humans, Specialty Boards
3.
Am J Orthod Dentofacial Orthop; 165(4): 383-384, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38402482

ABSTRACT

As a specialty board, the American Board of Orthodontics (ABO) serves to protect the public and the orthodontic specialty by certifying orthodontists. The demonstration of commitment to lifelong learning and self-improvement is critical to achieving the highest level of patient care. The ABO completed a practice analysis study in 2023 to ensure all examinations represent current assessments of proficiency in orthodontics at a level of quality that satisfies professional expectations. The practice analysis is essential to providing a demonstrable relationship between the examination content and orthodontic practice and provides a critical foundation for ABO's examination programs.


Subjects
Orthodontics, Humans, United States, Specialty Boards, Orthodontists, Dental Care
4.
J Surg Educ; 81(4): 578-588, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38402095

ABSTRACT

OBJECTIVE: The goals of this study were (1) to assess whether examiner ratings in the American Board of Surgery (ABS) General Surgery Certifying Exam (CE) are biased by the gender, race, or ethnicity of the candidate or the examiners, and (2) to assess whether the exam delivery format, in-person or virtual, affects how examiners rate candidates. DESIGN: We included every candidate-examiner combination for first-time takers of the general surgery oral exam. Total scores and pass/fail outcomes based on the 4 scores given by examiners to candidates were analyzed using multilevel models, with candidates as random effects. Explanatory variables included the gender, race, and ethnicity of candidates and examiners, and the exam format (in-person or virtual). Candidates' first-attempt scores on the ABS General Surgery Qualifying Exam (QE) were also included in the models to control for candidates' baseline knowledge. Three sets of models were evaluated for each demographic variable (gender, race, ethnicity) because of missing data. p-values and coefficients of determination (R2) were used to quantify the statistical and practical significance of the model coefficients (a relationship with CE scores was considered statistically and practically significant if the p-value was below 0.01 and R2 above 1%). PARTICIPANTS: All first-time takers of the American Board of Surgery General Surgery Certifying Exam from 2016 to 2022 who had demographic data, and the examiners who participated in those exams. RESULTS: The numbers of candidates/examiners for the 3 sets of models were 8665/514 (gender), 5906/465 (race), and 4678/295 (ethnicity). Neither the demographic variables, the exam format, nor their interactions were significantly related to examiner-candidate ratings or pass/fail outcomes. The only variable significantly related to CE scores was candidates' QE scores, included in the models as a measure of candidates' initial knowledge; this held across all models for total scores (F(1,8659) = 1069.89, p < 0.01, R2 = 5% [gender models]; F(1,5696.3) = 589.13, p < 0.01, R2 = 5% [race models]; F(1,4459.5) = 278.33, p < 0.01, R2 = 5% [ethnicity models]) and for pass/fail outcomes (CI = 1.61-1.73, p < 0.01, R2 = 3% [gender models]; CI = 1.67-1.85, p < 0.01, R2 = 3% [race models]; CI = 2.17-2.90, p < 0.01, R2 = 3% [ethnicity models]). CONCLUSIONS: This study found no relationship between candidate or examiner gender, race, or ethnicity and exam outcomes in statistical models of examiner-candidate ratings and pass/fail results. In addition, delivering the certifying exam in a virtual format appears to have no statistical impact on outcomes compared with in-person delivery. This suggests that the ABS certifying exam is performing well with respect to both demographic fairness and virtual delivery.
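The multilevel structure described, repeated examiner ratings nested within candidates, maps naturally onto a mixed-effects model. The sketch below, using hypothetical column names ("score", "cand_gender", "exam_gender", "fmt", "qe_score", "candidate_id"), illustrates one way such a model could be specified; it is not the ABS's actual analysis code.

```python
# Minimal sketch of a multilevel model for examiner ratings, with candidates
# as random effects. All file and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

ratings = pd.read_csv("ce_ratings.csv")  # one row per candidate-examiner rating

fit = smf.mixedlm(
    "score ~ cand_gender * exam_gender + fmt + qe_score",  # fixed effects
    data=ratings,
    groups=ratings["candidate_id"],  # random intercept per candidate
).fit()
print(fit.summary())
```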


Subjects
Certification, General Surgery, Humans, United States, Specialty Boards, Educational Measurement, Ethnicity, General Surgery/education, Clinical Competence
5.
JAMA Intern Med; 184(4): 349-350, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38345810

ABSTRACT

This essay shines a light on structural bias inherent to the board certification examination process, sharing the author's experience preparing and sitting for the examination while contending with co-occurring challenging life events.


Subjects
Certification, Specialty Boards, Humans, United States, Physical Examination, Educational Measurement, Clinical Competence
6.
Clin Exp Nephrol; 28(5): 465-469, 2024 May.
Article in English | MEDLINE | ID: mdl-38353783

ABSTRACT

BACKGROUND: Large language models (LLMs) have driven recent advances in artificial intelligence. While LLMs have demonstrated high performance on general medical examinations, their performance in specialized areas such as nephrology is unclear. This study aimed to evaluate the potential of ChatGPT and Bard for nephrology applications. METHODS: Ninety-nine questions from the Self-Assessment Questions for Nephrology Board Renewal from 2018 to 2022 were presented to two versions of ChatGPT (GPT-3.5 and GPT-4) and to Bard. We calculated the correct answer rates overall, per year, and per question category, and checked whether they exceeded the pass criterion. The correct answer rates were also compared with those of nephrology residents. RESULTS: The overall correct answer rates for GPT-3.5, GPT-4, and Bard were 31.3% (31/99), 54.5% (54/99), and 32.3% (32/99), respectively; GPT-4 thus significantly outperformed both GPT-3.5 (p < 0.01) and Bard (p < 0.01). GPT-4 reached the pass criterion in three of the five years, barely clearing the minimum threshold in two of them. GPT-4 performed significantly better on problem-solving, clinical, and non-image questions than GPT-3.5 and Bard. GPT-4's performance fell between that of third- and fourth-year nephrology residents. CONCLUSIONS: GPT-4 outperformed GPT-3.5 and Bard and met the Nephrology Board Renewal standards in specific years, albeit marginally. These results highlight both the potential and the limitations of LLMs in nephrology. As LLMs advance, nephrologists should understand their performance characteristics before applying them in practice.
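As a simple illustration of the scoring arithmetic above, the sketch below recomputes each model's correct answer rate from the reported counts and checks it against an assumed pass criterion; the board's actual threshold is not stated in the abstract.

```python
# Correct answer rates from the reported counts, checked against an assumed
# pass criterion. PASS_RATE is a placeholder, not the board's actual threshold.
results = {"gpt-3.5": 31, "gpt-4": 54, "bard": 32}
TOTAL = 99
PASS_RATE = 0.5  # assumption for illustration only

for model, n_correct in results.items():
    rate = n_correct / TOTAL
    verdict = "pass" if rate >= PASS_RATE else "fail"
    print(f"{model}: {n_correct}/{TOTAL} = {rate:.1%} -> {verdict}")
```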


Subjects
Nephrology, Self-Assessment (Psychology), Humans, Educational Measurement, Specialty Boards, Clinical Competence, Artificial Intelligence
8.
J Surg Educ; 81(2): 226-242, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38195275

ABSTRACT

PURPOSE: General surgery residents' medical knowledge is assessed by the American Board of Surgery In-Training Examination (ABSITE). ABSITE score reports contain many metrics that residency directors can use to assess resident progress and perform program evaluation. The purpose of this study was to develop a framework for evaluating program effectiveness in teaching specific subtest and subtopic areas of the ABSITE, using ABSITE score reports as an indicator. The aim was to identify topic areas of program-wide weakness on the ABSITE to guide proposed modifications to the general surgery residency curriculum, and to initiate development of a data-visualization dashboard to communicate these metrics. METHODS: A single-institution retrospective study was performed using ABSITE score reports from general surgery residents at a large academic training program from 2017 to 2020. ABSITE performance metrics from 320 unique records were entered into a database; statistical analyses for linear trends and variance were conducted for standard scores, subtest standard scores, and incorrect subtest topics. Deviations from national average scores were calculated by subtracting the national average score from each trainee's subtest score. Data were summarized as medians or proportions and displayed to optimize visualization as a proof of concept for a program dashboard. RESULTS: Trends and variance in program and cohort performance on the various elements of the ABSITE were visualized using figures and tables that represent a prototype program dashboard. Figure A1 demonstrates one example, in which a heatmap displays the median deviation from national average scores for each subtest by program year. Boxplots show the distribution of the deviation from the national average, the range of national average scores, and the recorded scores for each subtest by program year. Trends in median deviation from the national average are displayed for each program year paneled by subtest, or for each exam year paneled by cohort. The median change in overall test scores from one program year to the next within a cohort is shown as a table. Bar graphs show the most often missed topics across all program years, and heatmaps show the proportion of times each topic was missed for each subtest and exam year. CONCLUSIONS: We demonstrate the use of ABSITE reports to identify specific thematic areas of opportunity for curriculum modification and innovation as an element of program evaluation. Through data analysis and visualization, we demonstrate the feasibility of creating a Program ABSITE Dashboard (PAD) that enhances the use of ABSITE reports for formative program evaluation and can guide modifications to the surgery program curriculum and educational practices.
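The core dashboard computation, subtracting the national average from each trainee's subtest score and aggregating medians into a heatmap, is straightforward to prototype. The sketch below uses hypothetical file and column names ("subtest_score", "national_avg", "subtest", "pgy") and is not the authors' actual dashboard code.

```python
# Proof-of-concept sketch: median deviation from the national average per
# subtest and program year, shown as a heatmap. All names are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

scores = pd.read_csv("absite_scores.csv")  # one row per trainee per subtest

# Deviation = trainee subtest score minus the national average for that subtest
scores["deviation"] = scores["subtest_score"] - scores["national_avg"]

# Median deviation per subtest (rows) and program year (columns)
pivot = scores.pivot_table(index="subtest", columns="pgy",
                           values="deviation", aggfunc="median")

fig, ax = plt.subplots()
im = ax.imshow(pivot.values, cmap="RdBu_r", aspect="auto")
ax.set_xticks(range(len(pivot.columns)), labels=pivot.columns)
ax.set_yticks(range(len(pivot.index)), labels=pivot.index)
ax.set_xlabel("Program year")
fig.colorbar(im, label="Median deviation from national average")
plt.show()
```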


Subjects
General Surgery, Internship and Residency, Humans, United States, Graduate Medical Education, Specialty Boards, Retrospective Studies, Educational Measurement, Curriculum, General Surgery/education
10.
J Contin Educ Health Prof; 44(1): 2-10, 2024.
Article in English | MEDLINE | ID: mdl-36877811

ABSTRACT

INTRODUCTION: Evidence links assessment to optimal learning, affirming that physicians are more likely to study, learn, and practice skills when some form of consequence ("stakes") may result from an assessment. We lack evidence, however, on how physicians' confidence in their knowledge relates to performance on assessments, and whether this varies based on the stakes of the assessment. METHODS: Our retrospective repeated-measures design compared differences in patterns of physician answer accuracy and answer confidence among physicians participating in both a high-stakes and a low-stakes longitudinal assessment of the American Board of Family Medicine. RESULTS: After 1 and 2 years, participants were more often correct but less confident in their accuracy on a higher-stakes longitudinal knowledge assessment compared with a lower-stakes assessment. There were no differences in question difficulty between the two platforms. Variation existed between platforms in time spent answering questions, use of resources to answer questions, and perceived question relevance to practice. DISCUSSION: This novel study of physician certification suggests that the accuracy of physician performance increases with higher stakes, even as self-reported confidence in their knowledge declines. It suggests that physicians may be more engaged in higher-stakes compared with lower-stakes assessments. With medical knowledge growing exponentially, these analyses provide an example of the complementary roles of higher- and lower-stakes knowledge assessment in supporting physician learning during continuing specialty board certification.
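A repeated-measures comparison like the one described, the same physicians' accuracy and confidence under high- versus low-stakes conditions, could be sketched as below. The column names and the use of a paired t-test are illustrative assumptions, since the abstract does not specify the exact statistics used.

```python
# Sketch of a repeated-measures comparison of per-physician accuracy and
# confidence across high- vs low-stakes platforms. All names and the paired
# t-test are illustrative assumptions.
import pandas as pd
from scipy import stats

df = pd.read_csv("assessment_items.csv")  # one row per physician per question

per_phys = (df.groupby(["physician_id", "stakes"])
              .agg(accuracy=("correct", "mean"),
                   confidence=("confidence", "mean"))
              .reset_index())

# Wide layout: one row per physician, columns (measure, stakes)
wide = per_phys.pivot(index="physician_id", columns="stakes",
                      values=["accuracy", "confidence"]).dropna()

for measure in ("accuracy", "confidence"):
    t, p = stats.ttest_rel(wide[(measure, "high")], wide[(measure, "low")])
    print(f"{measure}: t = {t:.2f}, p = {p:.4f}")
```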


Subjects
Certification, Physicians, Humans, Retrospective Studies, Learning, Specialty Boards, Clinical Competence
11.
Ann Surg; 279(1): 187-190, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-37470170

ABSTRACT

OBJECTIVE: Historically, the American Board of Surgery (ABS) required surgeons to pass the qualifying examination (QE) before taking the certifying examination (CE). However, in the 2020-2021 academic year, amid mitigating circumstances related to COVID-19, the ABS removed this sequencing requirement to facilitate the certification process for candidates who were negatively impacted by a QE delivery failure. This decoupling of the traditional order of exam delivery provided a natural comparator to the traditional route and enabled an analysis of the impact of examination sequencing on candidate performance. METHODS: All candidates who applied for the canceled July 2020 QE were allowed to take the CE before passing the QE. The sample was then restricted to first-time candidates to ensure comparable groups for performance outcomes. Logistic regression was used to analyze the relationship between the order of taking the QE and the CE, controlling for performance on the other examination, international medical graduate status, and gender. RESULTS: Only first-time candidates who took both examinations were compared (n = 947). Examination sequence was not a significant predictor of QE pass/fail outcomes (OR = 0.54; 95% CI, 0.19-1.61; P = 0.26). However, examination sequence was a significant predictor of CE pass/fail outcomes (OR = 2.54; 95% CI, 1.46-4.68; P = 0.002). CONCLUSIONS: This study suggests that preparation for the QE increases the probability of passing the CE and provides evidence that knowledge may be foundational for clinical judgment. The ABS will consider these findings for examination sequencing moving forward.
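The analysis described, a logistic regression of CE pass/fail on exam sequence with covariates, could look like the following sketch. The variable names ("ce_pass", "qe_first", "qe_score", "img", "gender") are hypothetical stand-ins, not the ABS's actual data fields.

```python
# Sketch: logistic regression of CE pass/fail on exam sequence, controlling
# for QE score, IMG status, and gender. All names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

cand = pd.read_csv("candidates.csv")  # one row per first-time candidate;
                                      # qe_first coded 0/1 (1 = took QE first)

fit = smf.logit("ce_pass ~ qe_first + qe_score + img + gender", data=cand).fit()

# Odds ratio and 95% CI for taking the QE before the CE
print(np.exp(fit.params["qe_first"]))
print(np.exp(fit.conf_int().loc["qe_first"]))
```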


Subjects
General Surgery, Internship and Residency, Surgeons, United States, Humans, Specialty Boards, Educational Measurement, Certification, Logistic Models, General Surgery/education, Clinical Competence
16.
Tog (A Coruña); 20(2): 138-140, 2023 Nov 30.
Article in Spanish | IBECS | ID: ibc-228907

ABSTRACT

Knowing history, as in any area of life, is necessary not only to know where we come from but also to understand the present. It likewise serves as a guide, a projection towards a much better future for our profession and, therefore, for the people we may meet along the way. In light of this history, at COPTOA we see a need to talk about occupational therapy to other professionals, to potential beneficiaries, and to managers in health, education, and social services. Or, as this year's motto of the World Federation of Occupational Therapists puts it, "Unity through Community".


Subjects
Humans, Male, Female, Occupational Therapy/history, Social Participation, Community Participation, Societies, Specialty Boards
18.
JAMA; 330(14): 1329-1330, 2023 Oct 10.
Article in English | MEDLINE | ID: mdl-37738250

ABSTRACT

This Viewpoint weighs the demands of the ABIM's maintenance of certification (MOC) requirements against the projected benefits to quality of patient care.


Subjects
Clinical Competence, Specialty Boards, Certification/standards, Clinical Competence/standards, Continuing Medical Education/standards, Specialty Boards/standards, United States
19.
JAMA Pediatr; 177(9): 977-979, 2023 Sep 01.
Article in English | MEDLINE | ID: mdl-37459084

ABSTRACT

This Diagnostic/Prognostic Study evaluates the performance of a large language model in generating answers to practice questions for the neonatal-perinatal board examination.


Subjects
Certification, Specialty Boards, Newborn Infant, Humans, Language
20.
Acad Med; 98(10): 1104-1106, 2023 Oct 01.
Article in English | MEDLINE | ID: mdl-37406286

ABSTRACT

Across the medical profession there is broad acceptance of the critical role of continuing medical education (CME) in enabling physicians to adapt both to new information and to evolving expectations within the profession. Amid widespread participation in CME, some have attempted to question, discredit, or marginalize the role of ongoing lifelong assessment of physician knowledge and skills through specialty continuing certification, advocating instead for a participatory standard based only on engagement with CME. This essay outlines the limitations of physician self-evaluation and clarifies the need for external assessments. The role of certification boards is to set specialty-specific standards for competence, assess against those standards, and assure the public that certified physicians are adequately maintaining their skills and abilities; doing so credibly necessarily requires, in part, independent assessments of physician competence. In these contexts, the specialty boards are taking approaches that identify performance gaps and leverage intrinsic motivation to facilitate physician engagement in targeted learning. Specialty board continuing certification plays a unique role, distinct from and complementary to the CME enterprise. Calls to eliminate continuing certification requirements beyond self-directed CME contradict the evidence and fail the profession and the public.


Subjects
Clinical Competence, Medicine, Humans, United States, Certification, Specialty Boards, Continuing Medical Education